NISS Data Confidentiality Technical Panel: 
Final Report 



This is the final report of the NISS Data Confidentiality Technical Panel, which was 
convened by the National Institute of Statistical Sciences (NISS) at the request of the 
National Center for Education Statistics. 

1. Technical Panel Charge and Membership 

1.1. The Charge to the Data Confidentiality Technical Panel 

The National Center for Education Statistics (NCES) asked the National Institute of 
Statistical Sciences (NISS) to convene a data confidentiality technical panel (DCTP) to 
review the NCES current and planned data dissemination strategies for confidential data 
for the following elements: 

• Mandates and directives that NCES make data available. 

• Current and prospective technologies for protecting and accessing confidential 
data, as well as for breaking confidentiality. 

• The various user communities for NCES data and these communities’ uses of the 
data. 

The principal goals of the DCTP were to review the NCES current and planned data 
dissemination strategies for confidential data, assessing whether these strategies are 
appropriate in terms of both disclosure risk and data utility, and then to recommend to 
NCES any changes that the technical panel deems desirable or necessary. 

1.2. Members of the Data Confidentiality Technical Panel 

Alan Karr, NISS (chair) 

George Duncan, Carnegie Mellon University 
Stephen Fienberg, Carnegie Mellon University 
Bobby Franklin, Louisiana Department of Education 
Gerald Gates, Census Bureau (now, private consultant) 

Jerome Reiter, Duke University 

Lynne Stokes, Southern Methodist University 

Rebecca Wright, New Jersey Institute of Technology (now, Rutgers University) 

NISS postdoctoral fellow Anna Oganian provided technical support, including carrying 
out the experiments described in section 5 of this report. 

2. Technical Panel Activities 

The DCTP met in Washington, D.C., on December 8, 2006; all members were present 
(Lynne Stokes joined by teleconference). NCES staff members Paula Knepper and Neil 
Russell made presentations to the DCTP. Deputy Commissioner Jack Buckley, Chief 
Statistician Marilyn Seastrom, and Special Assistant to the Commissioner Andrew White 
participated in discussions. Subsequent DCTP interactions took place by teleconference 
and e-mail during the calendar year 2007, and were structured around working groups 
addressing the following topics: 

• Transformation of the Original Database to the Restricted Database: Karr, 
Oganian (see sections 4.4 and 5) 

• Transformation of the Restricted Database to the Public Database: Duncan, 
Reiter, Stokes, Wright (see section 4.5) 

• Data Access System Issues: Fienberg, Franklin, Gates, Karr (see section 4.6) 
3. Problem Formulation 



The recommendations in section 4 require a concrete formulation of the dissemination 
problem. Common to all situations considered, and shown in the left-hand panel in figure 
1, are: 

1. An original database O, as collected and edited (for instance, to adjust for 
nonresponse bias) by an NCES contractor. 

2. A restricted database R, produced by the data collection contractor from the 
original database using the NCES DataSwap software (see section 5). 
Adjustments may be made to maintain consistency with associated universe 
databases. 



Figure 1: Formulation of the dissemination problem 
As shown in the right-hand panel in figure 1, there may in addition be 

3. A public database M (for “masked”) produced from R by application of one or 
more methods for statistical disclosure limitation, which is available to the public 
without licensing (or any other restriction). 
Each of R and M (if the latter exists) is potentially accessible in two conceptually distinct 
ways: 

1. Directly, in the case of R under license from NCES, and in the case of M by 
download from an NCES web site. Any statistical analysis may be performed on 
either R or M. 

2. Electronically, by means of a data access system (DAS), to which users submit 
queries specifying statistical analyses to be performed on R or M. 

As the DCTP understands the information received from NCES, NCES is committed to 
access by license to R in all cases, and is eager to provide DAS access to R and/or M 
if confidentiality is not threatened. Consequently, for each NCES data collection, three 
decisions are necessary: 

1. Whether and under what circumstances to allow DAS access to R, as well as the 
nature of such access. 

2. Whether to produce and make available a public database M. 

3. If there is a public database M, whether and under what circumstances to allow 
DAS access to it, as well as the nature of the access. 
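
The three decisions above are not fully independent: the third arises only if a public database M exists. As a minimal sketch, and not part of the panel's formulation, the possible outcomes for one data collection can be enumerated as follows. The class and field names are hypothetical, and license access to R is taken as given in every case.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class AccessPlan:
    """Hypothetical record of the three per-collection decisions."""
    das_on_restricted: bool  # decision 1: DAS access to R
    public_db: bool          # decision 2: produce a public database M
    das_on_public: bool      # decision 3: DAS access to M

    def is_valid(self) -> bool:
        # Decision 3 is meaningful only when M exists.
        return self.public_db or not self.das_on_public


def enumerate_plans() -> list:
    """Return every consistent combination of the three decisions."""
    candidates = (AccessPlan(*bits) for bits in product([False, True], repeat=3))
    return [plan for plan in candidates if plan.is_valid()]


plans = enumerate_plans()
```

Of the eight yes/no combinations, the two that grant DAS access to a nonexistent M are excluded, leaving six consistent outcomes. The sketch ignores the "under what circumstances" and "nature of access" parts of decisions 1 and 3, which are not yes/no choices.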
4. Principal Recommendations 

Prior to stating its recommendations, the DCTP strongly commends NCES for the 
attention and care with which it approaches data confidentiality questions and for its 
willingness to balance disclosure risk and data utility. 

Overall Recommendation: The DCTP strongly recommends that NCES continue to treat 
the restricted database R as “ground truth” in the sense that all NCES analyses and 
publications are based on it rather than on the original database O. In 
particular, this ensures consistency between internal and external analyses. 

4.1. Access to Restricted Databases 

Recommendation 1: The DCTP recommends that, insofar as practical, access to 
restricted databases, whether directly or by DAS, be under license from NCES. 

Elaboration - The resultant structure is shown in figure 2. A DAS accessible only 
to licensed users has two compelling strengths: 

1. A data access system can be of unlimited statistical power, with full scripting 
capability and multiple user interfaces, including graphical interfaces. In fact, a 
licensed DAS of unlimited power would obviate the need for physical transfer of 
data from NCES to licensees, eliminating security and monitoring issues. Of 
course, a DAS of “unlimited statistical power” might be prohibitively expensive 
and complex to create and maintain. 

2. By recording and analyzing queries processed by the DAS, NCES would have a 
window into usage of its data that does not exist currently, and which could 
inform the design and improve the quality of future data collections (see 
Recommendation 7). 
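
As an illustration of the second point only, the following minimal sketch shows one way queries could be recorded and summarized by variable. It is an assumption made for exposition, not a description of any NCES system; the class name, record fields, and example variable names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class QueryLog:
    """Hypothetical in-memory log of queries processed by a DAS."""
    records: list = field(default_factory=list)

    def record(self, user_id: str, analysis: str, variables: list) -> None:
        # Store who asked, which kind of analysis, and which variables were used.
        self.records.append((user_id, analysis, tuple(variables)))

    def variable_usage(self) -> Counter:
        # Count how many logged queries referenced each variable.
        counts = Counter()
        for _, _, variables in self.records:
            counts.update(set(variables))
        return counts


log = QueryLog()
log.record("licensee-1", "mean", ["income", "region"])
log.record("licensee-2", "regression", ["income", "enrollment"])
usage = log.variable_usage()
```

A summary of this kind shows which variables licensees analyze most often, which is the sort of usage information the elaboration has in mind for informing future data collections.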

The DCTP acknowledges that a licensed DAS poses issues of authentication and 
encryption, but believes that current technologies are adequate to deal with them. 

This recommendation does not accommodate a public DAS operating on R, which is one 
access model under consideration — and in one case, in operation — by NCES. The DCTP 
believes that, given current understanding of the disclosure risks associated with data 
access systems (Gomatam et al. 2005b; Karr et al. 2006), such a DAS would have to be 
severely limited in terms of allowable queries and responses to be deemed safe, but might 
still be feasible. 

First, the DAS would require query space restrictions in order to address known, related 
problems, which include the following: 

1. Subsetting of the data. This is an issue for individual queries; for instance, the 
mean income of a small number of subjects is more informative about individual 
incomes than the mean income for a large number of subjects. More subtly 
(Oganian, Reiter, and Karr 2009), it is also an issue of query interaction: by 
comparing the results of two queries on subsets of the data differing by one 